Abstract

In this paper, we discuss the estimation of random errors due to shot noise in backscatter 
lidar observations that use either photomultiplier tube (PMT) or avalanche photodiode 
(APD) detectors. The statistical characteristics of photodetection are reviewed, and 
photon count distributions of solar background signals and laser backscatter signals are 
examined using airborne lidar observations at 532 nm using a photon-counting mode
APD. Both distributions appear to be Poisson, indicating that the arrival at the 
photodetector of photons for these signals is a Poisson stochastic process. For Poisson- 
distributed signals, a proportional, one-to-one relationship is known to exist between the 
mean of a distribution and its variance. Although the multiplied photocurrent no longer 
follows a strict Poisson distribution in analog-mode APD and PMT detectors, the 
proportionality still exists between the mean and the variance of the multiplied 
photocurrent. We make use of this relationship by introducing the noise scale factor 
(NSF), which quantifies the constant of proportionality that exists between the root- 
mean-square of the random noise in a measurement and the square root of the mean 
signal. Using the NSF to estimate random errors in lidar measurements due to shot noise 
provides a significant advantage over the conventional error estimation techniques, in that
with the NSF, uncertainties can be reliably calculated from a single data sample.
Methods for evaluating the NSF are presented. Algorithms to compute the NSF are 
developed for the Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observations 
(CALIPSO) lidar and tested using data from the Lidar In-space Technology Experiment 
(LITE). 

OCIS codes: 280.3640 (lidar), 040.5160 (photodetectors), 270.5290
(photon statistics)

1. Introduction


Lidar, or laser radar, has been used for atmospheric remote sensing since the early 1960s
to measure important atmospheric parameters (wind, temperature) and constituents such
as aerosols, clouds, and trace gases. Accurately estimating and accounting for the
measurement errors (or uncertainties) introduced by various lidar system components is 
an important issue that must be addressed in order to ensure the reliable application of 
lidar data products to atmospheric studies. Well-established error-propagation theory 1 is 
usually used in the error analysis of backscatter lidar observations. Based on this theory, 
an algebraic expression can be derived that computes the total uncertainty as a function
of the various error sources. However, application of this expression requires estimates 
of the uncertainties attributable to each significant source. 

There are two major types of uncertainty in lidar observations: random errors and bias 
(systematic) errors. Random errors are generally caused by random fluctuations (or 
noise) inherent in the measurement. For backscatter lidar measurements, these random 
fluctuations result primarily from: (a) quantum noise (also known as shot noise) due to 
the discrete nature of the incident light, charge carriers, and the interaction of light with 
the photodetector (i.e., photoemission); (b) thermal noise due to the random motion of 
electrons arising within the photodetector, load resistor and amplifier, and other noise
sources (e.g., 1/f noise); and (c) excess noise introduced in the multiplication
process when a photomultiplier tube (PMT) or an avalanche photodiode (APD) is 
operated in the analog detection mode. Random errors can be reduced by averaging or by 
repeating the measurement. Systematic errors, on the other hand, generally arise from 
sources such as inaccurate calibration, nonlinearities in the photodetector response, 
defects in optical components, and/or systematic electronic noise. This type of error
can produce a fixed amount of bias that cannot be reduced by averaging. In contrast to 
the random error, however, it is sometimes possible to reduce the effects of systematic 
errors when their sources are known. As the focus of this paper is random error, we will 
not be discussing systematic errors in any further detail. 

In lidar observations, the noise arising from background radiation and detector dark 
current, but excluding those fluctuations due to the scattering signal, is generally referred
to as the background noise. Background noise is easily measured and is independent of
range from the laser transmitter. The standard deviation of the background signal can be 
determined, for example, from the samples acquired before firing the laser (i.e., when 
there is no backscattered signal), or from the samples corresponding to very high altitudes 
(e.g., > 40 km) where the laser backscatter is negligibly small when compared to the 
magnitude of the background signal. In contrast to the background noise, the magnitude 
of the noise associated with the scattering signal depends on the range-resolved intensity 
of the backscattered light, and thus needs to be estimated separately for each data sample. 
In this paper we will focus our discussion on this latter type of error, and on methods for 
estimating its magnitude. 

For lidar measurements, the conventional method widely used to estimate the random 
error is to compute the standard deviation of a series of consecutive samples. These 
samples can be obtained either vertically, from a sequence of consecutive range bins within
a single lidar profile, or horizontally, from samples at the same range bin obtained over 
some number of consecutive profiles. When using these statistical techniques, however, 
the natural variability of the atmosphere can cause significant overestimates of the 
random component of the measurement error. This effect is especially severe in those 
areas where the atmospheric composition changes rapidly (e.g., within clouds). Given 
that measuring the variability of the atmosphere is one of the fundamental objectives to 
be realized by the use of backscatter lidar observations, it is thus highly desirable to have 
an error estimate that can be generated in a manner wholly independent of the ambient 
atmospheric content. 

In this paper, we introduce the noise scale factor (NSF) to estimate the random error due 
to signal shot noise. The derivation of the NSF is based on the fact that when the 
intensity of an incident light field does not fluctuate during the time of observation (i.e., 
when it remains in a statistically stationary state), photons sampled during this time will 
follow a Poisson stochastic process. 4, 5 In Section 2 of the paper we review the statistical 
basis of photodetection. The mathematical derivation of the NSF is presented in Section 
3. Practical techniques for ascertaining the correct value for the NSF are developed in 
Section 4. This development is illustrated via application to the lidar that will fly aboard 
the Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observations (CALIPSO)
satellite 6, and tested using data acquired during the Lidar In-space Technology Experiment
(LITE) mission 7. Issues of transferring the NSF from one signal domain to another, and concerns
arising from averaging partially correlated samples, are discussed in Section 5. 
Concluding remarks and a summary are given in Section 6. 


2. Statistics of Photodetection 


a. Shot Noise 


PMTs and APDs, operating either in a photon-counting mode or in an analog mode, are 
the standard photodetectors used for backscatter lidar observations. We will therefore 
focus our discussions on the statistics of photodetection using PMTs and APDs. 

Even if the radiation field is of constant intensity, the number of photons arriving at the 
photodetector during any time increment is inherently uncertain due to the quantum 
nature of light. Straightforward, statistical proofs exist showing that if photon arrival 
rates are time-independent (i.e., they can be described as being a statistically stationary 
process), the total number of photons arriving during any time interval $\tau$ is Poisson-
distributed. 4 Theoretical studies have established the correspondence between the
number of photons incident on the detector and the number of photoelectrons emitted,
and thus the photoelectrons also have a Poisson distribution. The probability of emitting
$n_p$ photoelectrons during time $\tau$ is given by

$$P(n_p) = \frac{\bar{n}_p^{\,n_p}\, e^{-\bar{n}_p}}{n_p!} . \qquad (1)$$

In this expression, $\bar{n}_p = \eta P \tau / h\nu$ represents the mean number of photoelectrons emitted, $P$ is
the power of the incident field, $\eta$ is the quantum efficiency of the detector, $h$ represents
Planck's constant, $\nu$ is the frequency of the field, and $h\nu$ is the energy of the
photon. Both here and afterwards, an overbar (e.g., $\bar{n}$) is used to indicate that a quantity
represents a mean or average value. For a Poisson distribution, the variance is equal to
the mean, so that

$$(\Delta n_p)^2 = \bar{n}_p , \qquad (2)$$

where $\Delta n_p$ represents the standard deviation. The variance quantifies the uncertainty in
the measurement due to shot noise. The Poisson distribution applies to light emitted from
an ideal laser having deterministic intensity, or from a thermal radiation source such as
the sun that has a coherence time $\tau_c$ much smaller than the sampling time $\tau$. 5 Examples of
the photon count distributions of solar background signals and laser scattering signals are
given in Figure 1. These data were acquired by the Cloud Physics Lidar (CPL),
which is an airborne, down-looking system that uses photon-counting detection and can
thus provide direct photon count measurements. The data shown in Figure 1 were
obtained from daytime measurements, where detector (APD) dark counts are negligibly
small when compared to the background light signal. The background signal distribution
in Figure 1(a) was compiled using 100 subsurface samples (i.e., signals
containing no laser backscatter) from each of 1000 profiles (a total of 100,000 samples). Each
CPL raw sample is acquired in a counting time period of 0.2 μs (corresponding to a 30 m
vertical resolution) and accumulated over 500 shots. This results in an effective counting
time of 0.1 ms for each raw sample. The composite atmospheric scattering distribution
(laser backscatter + background signal) shown in Figure 1(b) was derived using
six samples (range bins) from ~10 km in each of the 1000 profiles (a total of 6,000 samples). For
comparison, Poisson distributions having the same means as the measured data are also
shown in each panel, and it is clearly seen that both the laser scattering signal and the
solar background signal share the same type of distribution: Poisson.
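
The variance-equals-mean property that underlies this comparison can be checked numerically. The following is a minimal sketch, not the procedure used to produce Figure 1: it draws simulated counts from a Poisson process and compares the sample variance with the sample mean. The mean count of 12, the sample size, and the function names `poisson_pmf` and `draw_poisson` are illustrative assumptions.

```python
import math
import random


def poisson_pmf(n, mean):
    """Probability of n counts for a Poisson process with the given mean."""
    return math.exp(n * math.log(mean) - mean - math.lgamma(n + 1))


def draw_poisson(mean, rng):
    """Draw one Poisson variate (Knuth's product method; fine for small means)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


rng = random.Random(0)          # fixed seed so the run is repeatable
mean_counts = 12.0              # hypothetical mean photon count per sample
samples = [draw_poisson(mean_counts, rng) for _ in range(20000)]

sample_mean = sum(samples) / len(samples)
sample_var = sum((s - sample_mean) ** 2 for s in samples) / (len(samples) - 1)

# For Poisson-distributed counts the ratio variance/mean should be close to 1.
print(f"mean = {sample_mean:.2f}, variance = {sample_var:.2f}, "
      f"ratio = {sample_var / sample_mean:.3f}")
```

A measured histogram can be compared against `poisson_pmf(n, sample_mean)` in the same way that the panels of Figure 1 overlay a Poisson curve having the same mean as the data.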


In general, if the radiation field intensity varies with time, the photodetection statistics are 
governed by a compound Poisson distribution (also known as Mandel’s formula), 5, 9 
whose rate density is proportional to the instantaneous electromagnetic energy collected 
by the detector. In this case, the variance is given by

$$(\Delta n_p)^2 = \bar{n}_p + (\eta / h\nu)^2\, \Delta W^2 , \qquad (3)$$

where $W$ is the integrated optical intensity over time interval $\tau$ and $\Delta W$ is the standard
deviation of $W$. The additional term in the expression for the variance, $(\eta / h\nu)^2 \Delta W^2$,
results from the field fluctuations of the incident radiation. This term, which in
photodetection of thermal light is sometimes called “photon-bunching noise”, is a
consequence of the correlation of fluctuations in the thermal light intensity. 5, 9 In
backscatter lidar observations, this excess noise may arise from fluctuations in the laser 
source and/or the natural variation of the atmospheric scattering media. Fluctuations of 
laser output are usually small, and are monitored during the observations in order to 
energy-normalize the lidar data prior to subsequent analyses. As a result, the effect of 
laser fluctuations is ignored in our analysis. However, the variability of the atmosphere 
and its components, especially clouds, can be very large. As mentioned above, 
characterizing this atmospheric variability is one of the primary objectives of backscatter 
lidar measurements. We therefore do not include an atmospheric variability term in our 
random uncertainty estimates. 

b. Excess Noise (Multiplication Noise) 

For a PMT or an APD operated in the analog detection mode, the output electrons 
(multiplied photoelectrons) at the anode do not obey Poisson statistics, even if the 
incident photons (or emitted photoelectrons) do. 10-12 This is because the photoelectron 
multiplication in these detectors is also a stochastic process, which can introduce an 
excess noise. In a typical PMT, the photoelectrons emitted from the photocathode are 
multiplied by a set of dynodes via the secondary emission of electrons. The probability 
distributions of the multiplication gains of these PMTs can be described by a multiple 
stochastic (compound) Poisson distribution 10, 12 . In APDs, on the other hand, 
photoelectrons can initiate impact ionization to produce extra hole-electron pairs, which 
in turn result in more hole-electron pairs as they move through the space-charge region 
(avalanche region or multiplying region). The photocurrent is thus multiplied. For a 
uniform APD having a thick multiplying region, the probability distribution of gains can 
be characterized analytically by a local-field theory. 11 

The variance of multiplied electrons can be expressed as 10-12

$$(\Delta n_m)^2 = F_m G_m \bar{n}_m , \qquad (4)$$

where $n_m$ is the number of multiplied electrons and $\bar{n}_m$ is the mean number of such
electrons. $\bar{n}_m$ is determined by

$$\bar{n}_m = G_m \bar{n}_p , \qquad (5)$$

where $G_m$ is the average gain of the multiplication and $\bar{n}_p$ is the mean number of
photoelectrons. The $F_m$ term in Eq. (4) represents the excess noise factor that is
used to quantify the extra noise caused by the variability of the multiplication gain in a
PMT or APD. The excess noise factor is a function of the average gain, $G_m$, for both
PMTs and APDs 10, 12. For PMTs, $F_m$ normally ranges from 1 to 2, and decreases as $G_m$
increases. For APDs, $F_m$ increases with increasing $G_m$ and is normally larger than 2. The
larger excess noise introduced in the APD is due to the greater uncertainty of the APD
multiplication gain. The APD gain variation arises from two sources: (1) the randomness
in the locations at which ionizations may occur, and (2) the feedback process associated
with the fact that both electrons and holes can produce impact ionizations as they move in
opposite directions. In contrast, in standard PMTs only one carrier (electrons) causes
secondary emissions (or multiplication), and this occurs only at fixed locations
(dynodes). For PMTs having an identical gain factor $m$ for each dynode, the excess noise
factor is given by 10, 12

$$F_m = \frac{m}{m - 1} . \qquad (6)$$

In this case, $G_m = m^N$, where $N$ is the number of dynodes. For uniformly multiplying
APDs, 11 the excess noise factor is

$$F_m = k\, G_m + (1 - k)\left(2 - \frac{1}{G_m}\right) , \qquad (7)$$

where $k$ is the ratio of ionization coefficients due to holes and electrons. As an example,
$F_m = 1.5$ for PMTs when $m = 3$, and $F_m \approx 5$ for APDs when $k = 0.03$ and $G_m = 100$.
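
The two numerical examples just quoted can be reproduced with a few lines of code. This is a minimal sketch that assumes the excess noise factors take the forms F_m = m/(m - 1) for a PMT with identical dynode gain m, and F_m = k G_m + (1 - k)(2 - 1/G_m) for a uniformly multiplying APD; the function names are illustrative only.

```python
def pmt_excess_noise(dynode_gain):
    """PMT excess noise factor for identical gain m per dynode (requires m > 1)."""
    return dynode_gain / (dynode_gain - 1.0)


def pmt_total_gain(dynode_gain, n_dynodes):
    """Average PMT gain G_m = m**N for N identical dynodes."""
    return dynode_gain ** n_dynodes


def apd_excess_noise(k, gain):
    """APD excess noise factor for ionization ratio k and average gain G_m."""
    return k * gain + (1.0 - k) * (2.0 - 1.0 / gain)


# The two cases quoted in the text.
print("PMT, m = 3:", pmt_excess_noise(3.0))
print("APD, k = 0.03, G_m = 100:", round(apd_excess_noise(0.03, 100.0), 3))
```

The APD expression evaluates to slightly under 5 for these inputs, which is why the example value is best read as approximate.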

3. Noise Scale Factor (NSF) 

As shown by the above discussion, there exists a proportional relation between the
variance and the mean of the shot noise for both PMTs and APDs operated in either a
photon-counting detection mode or an analog mode. Based on this proportionality, we
introduce the noise scale factor to estimate the standard deviation, $\Delta x$, of the shot noise in
a measurement $x$ from its mean, $\bar{x}$, using

$$\Delta x = \mathrm{NSF} \cdot \bar{x}^{1/2} . \qquad (8)$$

NSF has units of the square root of the units for $x$. For lidar observations using photon
counting (e.g., Refs. 8 and 13), the random error due to shot noise can be estimated from
the number of photon counts based on Eq. (2). In this case, NSF = 1 (counts$^{1/2}$) in the
photon-counts domain. For analog detection, the NSF in the multiplied-photoelectron
domain is given by

$$\mathrm{NSF} = (F_m G_m)^{1/2} . \qquad (9)$$

For lidar observations, the data are normally sampled using a digitizer. In the digitizer-
readings domain, NSF can be derived from the signal-to-noise ratio analysis for the lidar
measurements (see, e.g., Ref. 14), and is computed using

$$\mathrm{NSF} = (2 e B F_m G_m G_A)^{1/2} . \qquad (10)$$

Here $e$ is the electron charge, $B \approx 1/(2\Delta T_0)$ is the spectral bandwidth of the lidar receiver,
and $\Delta T_0$ is the integration time. $G_A$ is a gain factor that converts the anode current of the
detector to digitizer counts, with the assumption that linear amplifiers are used. $G_A$ is a
product of a number of converting/scaling factors and gains.

In practice, some amount of background signal, arising from the background radiation,
detector dark current, etc., is unavoidably included in the lidar measurements. Thus each
digitized sample, $V$, can be written as $V = V_s + V_b$, where $V_s$ represents the laser
backscatter signal and $V_b$ represents the background contribution. The overall random
uncertainty for each sample is therefore the quadrature sum of the uncertainties in each of
these quantities:

$$\Delta V = \left[ \mathrm{NSF}^2\, \bar{V}_s + (\Delta V_b)^2 \right]^{1/2} . \qquad (11)$$

In this expression $\bar{V}_s$ is the mean of the scattering signal, and $\Delta V_b$ is the background
noise; i.e., the standard deviation of the background signal. $\Delta V_b$ can be measured directly
from the samples where there is no laser scattering signal (e.g., subsurface samples or
very high-altitude samples). Generally $\bar{V}_s$ is unknown. However, if the measurement is
not very noisy, the uncertainty can be estimated from a single sample using

$$\Delta V \approx \left[ \mathrm{NSF}^2\, V_s + (\Delta V_b)^2 \right]^{1/2} . \qquad (12)$$

Note that, in practice, $V_s$ is typically derived by subtracting the measured mean value of
the background signal, $\bar{V}_b$, from the raw digitizer reading $V$; i.e., $V_s = V - \bar{V}_b$. When
computed in this manner, $V_s$ is also a random variable, and thus an additional uncertainty,
$\Delta \bar{V}_b$, which represents the uncertainty in the measured $\bar{V}_b$, must be introduced into the
calculation; that is,

$$\Delta V \approx \left[ \mathrm{NSF}^2\, V_s + (\Delta V_b)^2 + (\Delta \bar{V}_b)^2 \right]^{1/2} . \qquad (13)$$

$\Delta V_b$ is usually determined by computing the standard deviation of a number of samples
where there is no scattering signal, and $\bar{V}_b$ is the mean of these samples. Therefore, the
error in the estimate of the mean is

$$\Delta \bar{V}_b = \frac{\Delta V_b}{\sqrt{N_b}} , \qquad (14)$$

where $N_b$ is the number of the samples from which $\bar{V}_b$ is computed. This number is
usually quite large, so that $\Delta \bar{V}_b$ is typically much smaller than $\Delta V_b$.
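
The single-sample estimate described above can be sketched in code. The example below assumes the uncertainty is the square root of NSF² V_s plus the squared background noise plus the squared error of the background mean, with the latter equal to the background standard deviation divided by the square root of the number of background samples. The function names, the clamping of negative signals, and all numbers are illustrative assumptions, not part of the algorithm described in this paper.

```python
import math


def background_stats(bg_samples):
    """Return (mean, RMS noise, error of the mean) for signal-free samples."""
    n = len(bg_samples)
    mean = sum(bg_samples) / n
    variance = sum((v - mean) ** 2 for v in bg_samples) / (n - 1)
    rms = math.sqrt(variance)
    return mean, rms, rms / math.sqrt(n)


def sample_uncertainty(v_raw, nsf, bg_mean, bg_rms, bg_mean_err):
    """Random error of one raw digitizer reading, estimated from that reading alone."""
    # Assumption: a negative background-subtracted signal is clamped to zero
    # so that the shot-noise term cannot become negative.
    v_s = max(v_raw - bg_mean, 0.0)
    return math.sqrt(nsf**2 * v_s + bg_rms**2 + bg_mean_err**2)


# Hypothetical numbers in digitizer counts.
bg_mean, bg_rms, bg_mean_err = background_stats([10.0, 12.0, 8.0, 10.0])
delta_v = sample_uncertainty(110.0, 2.0, bg_mean, bg_rms, bg_mean_err)
print(f"background mean = {bg_mean:.2f}, RMS = {bg_rms:.3f}, "
      f"error of mean = {bg_mean_err:.3f}, sample uncertainty = {delta_v:.3f}")
```

In real data the background statistics would come from hundreds or thousands of signal-free samples rather than four, so the error-of-the-mean term would be correspondingly smaller.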

The advantages of using the NSF to estimate the uncertainties inherent in lidar backscatter
measurements are illustrated by the CPL profile measurements shown in Figure 2.
To derive the conventional error estimates, standard deviations (with respect
to the mean signals) have been computed for each altitude bin between 0 km and 16 km
for a sequence of 100 consecutive profiles. These values are plotted using a dashed line.
For comparison, standard deviations estimated from a single profile using the NSF
technique are plotted using a solid line. The uncertainties computed using the two
methods are generally consistent in the aerosol-free region above ~1.5 km, where only
molecular scattering exists. However, in the aerosol layer between 0 km and 1.5 km,
significant overestimates appear in the uncertainties computed using the conventional
method. This behavior is due to the horizontal variation of the particle concentration
within the aerosol layer (i.e., the implicit inclusion of the $\Delta W^2$ term in Eq. (3)). This
comparison clearly shows that the conventional method can overestimate the random
error. More importantly, a number of horizontally homogeneous profiles are required in
order to derive accurate results using the conventional method. On the other hand, the
NSF method can estimate the random error using only a single sample.

4. NSF Measurement 

When the parameters in Eq. (10) are all known, the calculation of NSF is straightforward.
$G_A$ and $B$ can be determined accurately based on laboratory experiments, and they
generally do not vary during the observation period. $G_m$ and $F_m$, however, may vary
during the observation period, in concert with changes in the lidar operating environment.
An example where this situation can be expected to occur is provided by the Cloud-
Aerosol Lidar with Orthogonal Polarization 6 (CALIOP) that will fly aboard the
CALIPSO satellite. CALIOP (pronounced as “calliope”) is a satellite-borne, two-
wavelength (532 nm and 1064 nm), polarization-sensitive (at 532 nm) lidar that,
following its launch in early 2006, will conduct continuous observations of the
atmosphere from space for three years. Two PMTs and one APD operated in analog
mode are used to detect the two 532-nm polarization signals and the single 1064-nm total
signal. The gains (and consequently excess noise factors) of these detectors (especially
the PMTs) may change significantly during the course of the three-year mission. They
may also change considerably during the launch phase from the ground to space, due to
the severe vibration and huge change in temperature. Consequently, the NSF must be
monitored constantly during on-orbit operations in order to make use of this factor in
estimating contributions to the signals from random noise. This section discusses
techniques for measuring NSF using the solar background signals, and develops
operational algorithms for use by the CALIPSO lidar. Because the CALIOP detectors are
operated in analog mode, and because, in general, NSF = 1 (counts$^{1/2}$) for both PMTs and
APDs operated in the photon counting mode, we focus our discussion on the
measurement of NSF for the analog detection mode of the two detector types.

As of this writing, CALIPSO has yet to be launched, hence we illustrate the algorithm
development discussion using data acquired by LITE. 7 LITE was the world’s first spaceborne
lidar, a three-wavelength backscatter system that flew aboard NASA space shuttle
flight STS-64 in September of 1994. Figure 3 presents an example of a single-shot
lidar profile measured at 532 nm during the nighttime portion of LITE orbit 117.
Like CALIPSO, LITE used a PMT for the 532-nm measurements and an APD for the
1064-nm channel 7. The LITE data system acquired 5500 range-resolved samples per profile at
a 10 MHz sampling rate (i.e., 15 meters per range bin). The background signal (DC
component) was measured and recorded onboard by a background monitor, and 
automatically removed from each profile prior to digitization. In the figure, the return 
signal below ~40-km is seen to increase with decreasing height. This increase is due to 
the increasing atmospheric molecular number density and the greater incidence of 
suspended particles (aerosols). The scattering signal from the upper atmosphere (> ~40 
km) is very small compared with the background signal (background radiation and dark 
current, etc.). 

For a PMT in the analog mode, the dark noise is generally negligibly small when 
compared with the solar background noise during daytime measurements. For an APD in 
the analog mode, the dark noise (which is predominantly amplifier noise 7 ) is dominant 
during nighttime measurements and is comparable to the solar radiation noise during 
daytime measurements. The NSF can then be derived, based on Eq. (8), using

$$\mathrm{NSF} = \frac{\Delta V_b}{\bar{V}_b^{\,1/2}} \qquad (15)$$

when the solar radiation noise is dominant (i.e., daytime measurements), and using

$$\mathrm{NSF} = \frac{\left[ (\Delta V_b)^2 - (\Delta V_d)^2 \right]^{1/2}}{\left( \bar{V}_b - \bar{V}_d \right)^{1/2}} \qquad (16)$$

when the dark noise cannot be ignored (nighttime measurements). In these equations,
$\Delta V_b$ and $\Delta V_d$ represent the RMS noise of the total background signal and the component
due to detector dark current, and $\bar{V}_b$ and $\bar{V}_d$ are their means, respectively. We note that
Eq. (15) is an approximation of Eq. (16), valid only when the dark noise is negligibly
small. In the following subsections, methods for computing each of these quantities are
described. In addition, test results derived using LITE measurements at both 532 nm and
1064 nm are presented for both PMTs and APDs, and are discussed in detail.
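
The two estimators can be sketched as follows. The sketch assumes the simple form is the background RMS noise divided by the square root of the background mean, and that the dark-corrected form subtracts the dark variance in the numerator and the dark mean in the denominator before taking the square root of the ratio. The function names, the choice to return `None` for non-physical inputs, and the synthetic numbers are assumptions made for illustration.

```python
import math


def nsf_simple(bg_rms, bg_mean):
    """NSF when solar background dominates: RMS noise over sqrt(mean)."""
    return bg_rms / math.sqrt(bg_mean)


def nsf_dark_corrected(bg_rms, bg_mean, dark_rms, dark_mean):
    """NSF after removing the dark-current variance and mean."""
    numerator = bg_rms**2 - dark_rms**2
    denominator = bg_mean - dark_mean
    if numerator <= 0.0 or denominator <= 0.0:
        return None  # estimate undefined; background too close to dark level
    return math.sqrt(numerator / denominator)


# Synthetic check with hypothetical values: a detector with NSF = 2 sees a
# solar background of 100 counts on top of a dark level of 5 +/- 3 counts.
true_nsf, solar_mean, dark_mean, dark_rms = 2.0, 100.0, 5.0, 3.0
total_mean = solar_mean + dark_mean
total_rms = math.sqrt(true_nsf**2 * solar_mean + dark_rms**2)

print("simple estimate:        ", round(nsf_simple(total_rms, total_mean), 4))
print("dark-corrected estimate:", nsf_dark_corrected(total_rms, total_mean,
                                                     dark_rms, dark_mean))
```

With these inputs the simple estimate is biased by the dark contribution, while the dark-corrected estimate recovers the value used to build the synthetic data.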


a. NSF Estimation for PMTs 

The RMS noise and the mean of the solar background signal must be derived in order to
compute NSF using Eq. (15). The RMS background noise is estimated by calculating the
standard deviation over a large number of samples in each profile, selected from a region
where the laser scattering signal is negligibly small (i.e., above ~40 km; refer to Figure
3). For LITE and CALIPSO, the background signal is (or, for CALIPSO, will
be) derived by converting the background monitor reading from its native units into
equivalent science digitizer counts. In Figure 4(a) we present the square root of
the background signal and the RMS noise derived from the high altitude region for LITE
measurements at 532 nm (i.e., PMT detection) acquired during orbit 117. It is shown that
the solar radiation background dominates the background signal for the daytime portion
of the orbit. The NSF values derived using Eq. (15) are shown in Figure 4(b). It
is seen that the NSF is generally constant for the daytime portion of the orbit. However,
at profile number 2200 there is a step change of ~10%, which is most likely due to an
undocumented change in the PMT gain. The sudden spike in NSF values (to ~10) for the
profiles from ~2000 to 2200 is due to the saturation of the background monitor digitizer,
which can be seen in the flat-line segments of the square-root curve in Figure
4(a). The NSF values calculated for the nighttime portion, where the detector dark noise
contributes significantly to the background noise, are generally smaller than those
for the daytime portion. However, for those regions where lunar light (backscattered from
dense clouds etc.) dominates the background signal, the NSFs have values similar to
those computed during the daytime portion. The nighttime NSF values are also
substantially noisier, due to the very low levels of background illumination combined
with the limited resolution of the LITE background monitors.

b. NSF Estimation for APDs 

Due to the presence of large amounts of dark noise, NSF measurement for an APD in
analog mode is relatively complicated. The APD dark current represents the dominant
noise source in the nighttime portion of the data, and is comparable to the solar
background signal for the daytime portion. The computational difficulties arising from
this situation are illustrated in the sequence of plots shown in Figure 5. The
upper panel (Figure 5(a)) shows the square root of the 1064-nm background
monitor reading and the RMS noise of the background signal at 1064 nm, computed over
the same data segment shown in Figure 4. The RMS noise is once again
estimated using the samples above ~40 km, where the laser backscatter is negligible.
Figure 5(b) shows the NSF estimates that would be computed without first
correcting the measurements for dark components; i.e., by using Eq. (15) rather than Eq.
(16). The large NSF oscillations seen in the daytime segment of the APD data compare
poorly with the consistent results obtained using the PMT measurements, and are a direct
consequence of the dark noise contributions from the APD and the amplifier.

The APD NSF computed for the daytime portion using Eq. (16) with the dark
components removed is presented in Figure 5(c). The mean value, $\bar{V}_d$, and RMS
noise, $\Delta V_d$, of the dark current were determined from the nighttime portion of the data,
under the assumption that these quantities do not change significantly in the transition
from nighttime to daytime observations. The NSF computed using Eq. (16) is generally
constant. However, very large variations appear in regions where the solar background
signal is quite small compared with the dark current. This is because, when such
conditions occur, the magnitude of $\bar{V}_b$ becomes very similar to that of $\bar{V}_d$ in the denominator
of Eq. (16), and the uncertainties in their determination become bigger than the difference
of their average values. These near-zero values in the denominator give rise to the very
noisy behavior of the NSF estimate computed via Eq. (16) and seen on the left-hand side
of Figure 5(c). To stabilize the calculation of the NSF, a modified form of the
equation is derived, such that

$$\mathrm{NSF} = \frac{\Delta V_b}{\sqrt{\bar{V}_b + c}} , \qquad (17)$$

where $c$ is a constant that satisfies

$$c = \bar{V}_d \left[ \frac{(\Delta V_d)^2 / \bar{V}_d}{\mathrm{NSF}^2} - 1 \right] \qquad (18)$$

under the assumption that the NSF and $(\Delta V_d)^2 / \bar{V}_d$ do not change for the chosen data
segment. The value of $c$ is chosen by trial so that the NSF curve is flattest over the entire data
segment. The NSF determined according to Eq. (17) is also presented in Figure
5(c). It is seen to be constant over the entire data segment, with a mean of 1.39, and is
generally consistent with the NSF computed using Eq. (16). This modified approach
appears to be much less sensitive to noise when the background levels are low.
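
One way the trial selection of the constant c could be automated is sketched below. This is not the procedure used for Figure 5: it assumes the modified estimator has the form NSF = ΔV_b / sqrt(V_b + c), and it scores "flatness" by the relative spread (standard deviation divided by mean) of the resulting NSF curve, which is our own choice of metric. The function names, grid limits, and synthetic data are hypothetical.

```python
import math
import statistics


def nsf_modified(bg_rms, bg_mean, c):
    """Modified estimator: background RMS noise over sqrt(background mean + c)."""
    return bg_rms / math.sqrt(bg_mean + c)


def flattest_offset(bg_rms_series, bg_mean_series, c_min, c_max, step):
    """Grid-search c, scoring flatness by the relative spread of the NSF curve."""
    best_c, best_score = None, float("inf")
    n_steps = int(round((c_max - c_min) / step))
    for i in range(n_steps + 1):
        c = c_min + i * step
        if min(bg_mean_series) + c <= 0.0:
            continue  # square root undefined for this trial value
        nsf = [nsf_modified(r, m, c) for r, m in zip(bg_rms_series, bg_mean_series)]
        score = statistics.pstdev(nsf) / statistics.fmean(nsf)
        if score < best_score:
            best_c, best_score = c, score
    return best_c


# Synthetic, noise-free test data (all values hypothetical).
true_nsf, dark_mean, dark_rms = 1.4, 50.0, 12.0
solar_means = [2.0, 5.0, 10.0, 50.0, 200.0, 800.0]
bg_means = [dark_mean + s for s in solar_means]
bg_rms = [math.sqrt(true_nsf**2 * s + dark_rms**2) for s in solar_means]

c_best = flattest_offset(bg_rms, bg_means, -40.0, 100.0, 0.1)
nsf_curve = [nsf_modified(r, m, c_best) for r, m in zip(bg_rms, bg_means)]
print(f"c = {c_best:.1f}, mean NSF = {statistics.fmean(nsf_curve):.3f}")
```

For this synthetic segment the selected offset lands next to the value dark_rms²/NSF² minus dark_mean, and the recovered NSF stays flat even where the solar contribution is only a few counts above the dark level.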


5. NSF Application Issues 

a. Transferring NSF 

The value of NSF is signal domain dependent. The formula for a linear transform of NSF
from a domain $V$ to another domain $V' = K \cdot V$ is given by

$$\mathrm{NSF}_{V'} = K^{1/2}\, \mathrm{NSF}_V . \qquad (19)$$

$K$ is a conversion factor independent of $V$ or $V'$. The derivation of this formula is
straightforward. As an example, the application of this formula to the lidar measured
attenuated backscatter coefficients, which are a fundamental lidar product, is discussed
below.

Raw lidar measurements are usually further processed in order to produce additional
meaningful data products. The attenuated backscatter coefficients, $\beta'(r)$, are derived by
range-correcting and scaling the background-subtracted samples, $V_s = V - \bar{V}_b$, as follows:

$$\beta'(r) = \beta(r)\, T^2(r) = \frac{r^2\, V_s(r)}{C} . \qquad (20)$$

Here $\beta(r)$ is the atmospheric backscatter coefficient (including both molecular and
particulate contributions) at range $r$; $T$ is the atmospheric transmittance, which accounts
for signal attenuation between the lidar and the volume of atmosphere at range $r$; and $C$ is
the lidar calibration constant. $\bar{V}_b$ is the measured background signal, which is usually
determined from the mean of the subsurface samples where there is no laser scattering
signal (e.g., as for CPL and other down-looking lidars) or from the samples acquired at
high altitudes (e.g., 65-80 km for the CALIPSO lidar) where the laser scattering signal
due to the atmospheric molecules and particles is negligibly small. The uncertainty in $\beta'$
due to shot noise can be estimated using


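$$\Delta \beta' = \frac{r^2}{C} \left[ \mathrm{NSF}_V^2\, V_s + (\Delta V_b)^2 + (\Delta \bar{V}_b)^2 \right]^{1/2} = \left[ \mathrm{NSF}_{\beta'}^2\, \beta' + \left( \frac{r^2}{C} \right)^2 \left( (\Delta V_b)^2 + (\Delta \bar{V}_b)^2 \right) \right]^{1/2} , \qquad (21)$$

where $\Delta \bar{V}_b = \Delta V_b / \sqrt{M_b}$, and $M_b$ is the number of the samples used to compute $\bar{V}_b$ and
$\Delta V_b$. $\mathrm{NSF}_V$ is the noise scale factor in the $V$ domain and $\mathrm{NSF}_{\beta'} = (r^2 / C)^{1/2} \cdot \mathrm{NSF}_V$ (refer
to Eq. (19)) is the noise scale factor in the $\beta'$ domain.

Note, however, that NSF is not constant in some domains. For example, in the $\beta'(r)$
domain NSF is a function of $r$. In practice, it is usually more convenient (and less error-
prone) to derive and apply the NSF in a domain in which its value is constant.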

b. Sample Average

To produce high quality lidar data products, signal averaging over a number of range bins
or over a number of profiles (laser shots) is usually required. (Note, however, that while
averaging is an effective way to reduce noise, as a trade-off it also degrades the resolution
of the data.) When the samples are totally uncorrelated and N samples are averaged, the
RMS noise (standard deviation) is reduced by a factor equal to the square root of N [1, 10, 11].
Therefore, if the samples used in averaging are totally independent (uncorrelated),

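the random error due to noise in an averaged measurement,

$$V_{\mathrm{avg}} = \frac{1}{N_{\mathrm{shot}}\,N_{\mathrm{bin}}} \sum_{j=1}^{N_{\mathrm{shot}}} \sum_{i=1}^{N_{\mathrm{bin}}} V_{s,j}(r_i),$$

is estimated by

$$\Delta V_{\mathrm{avg}} = \left[\frac{1}{N_{\mathrm{shot}}\,N_{\mathrm{bin}}}\right]^{1/2} \left[\mathrm{NSF}^2 \cdot V_{\mathrm{avg}} + (\Delta V_{b,\mathrm{avg}})^2 + (\Delta\bar{V}_{b,\mathrm{avg}})^2\right]^{1/2}, \qquad (23)$$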


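where $N_{\mathrm{bin}}$ and $N_{\mathrm{shot}}$ are the number of range bins and laser shots, respectively, used to
compute the average, and the background terms are averaged over the shots; i.e.,
$(\Delta V_{b,\mathrm{avg}})^2 = \sum_{j=1}^{N_{\mathrm{shot}}} (\Delta V_{b,j})^2 / N_{\mathrm{shot}}$ and
$(\Delta\bar{V}_{b,\mathrm{avg}})^2 = \sum_{j=1}^{N_{\mathrm{shot}}} (\Delta\bar{V}_{b,j})^2 / N_{\mathrm{shot}}$.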
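The square-root-of-N reduction for uncorrelated samples can be illustrated with a short simulation. This is only a sketch: the Gaussian noise, the signal level, and the block sizes are arbitrary choices, not values taken from any lidar data set.

```python
import math
import random
import statistics

def std_of_block_means(samples, n):
    """Standard deviation of the means of consecutive, non-overlapping blocks of n samples."""
    means = [statistics.fmean(samples[i:i + n])
             for i in range(0, len(samples) - n + 1, n)]
    return statistics.stdev(means)

rng = random.Random(0)
# Independent Gaussian noise around a constant signal (hypothetical values).
samples = [100.0 + rng.gauss(0.0, 5.0) for _ in range(64_000)]

sigma_single = statistics.stdev(samples)
ratios = {}
for n in (4, 16, 64):
    measured = std_of_block_means(samples, n)
    predicted = sigma_single / math.sqrt(n)   # N^(-1/2) scaling
    ratios[n] = measured / predicted
    print(n, round(measured, 3), round(predicted, 3))
```

For independent samples the measured and predicted values agree to within the statistical scatter of the simulation, so each entry of `ratios` is close to 1.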
c. Correlation Correction

Lidar design considerations (e.g., bandwidth and sampling frequency) may lead to the 
acquisition of samples that are partially correlated with neighboring samples. For 
example, the sampling interval of the LITE data is 15 meters, while the fundamental 
range resolution of the system is limited by the bandwidth of the lidar receiver (amplifier) 
to a resolution slightly greater than 30 meters (i.e., more than two sample intervals). As a 
result, neighboring samples (2~3 bins) in a LITE backscatter profile are partially
correlated. Figure 6(a) shows the autocorrelation function derived from the LITE
orbit 117 measurements. The calculation was restricted to the uppermost 2500 samples
(i.e., data from above 40 km, where atmospheric backscatter is negligible), and averaged
over 6000 profiles, so that the backscattered solar signal is essentially constant. The plot
clearly shows that each LITE sample is at least partially correlated with the two samples
before or after it.

Though the RMS noise is expected to scale as $N^{-1/2}$ when N independent
samples are averaged, if the samples are partially correlated the correct expression for the
relationship described by Eq. (23) becomes more complicated. To illustrate this, Figure 6(b)
presents the standard deviation as a function of the number of range bins used to
compute the average. All values were computed from the same data segment of the LITE
orbit 117 measurements. For comparison, the standard deviation predicted by the $N^{-1/2}$
relation is also presented. Figure 6(b) clearly shows that the actual reduction of
noise is not as large as would be predicted by the $N^{-1/2}$ relation. This is due to the partial
correlation between the neighboring samples, as demonstrated in Figure 6(a).
The ratio of the measured standard deviation curve to the $N^{-1/2}$ curve is presented in
Figure 6(c) (dashed line). This ratio is larger than 1.5 when the number of
samples averaged is larger than 10.
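The effect of sample-to-sample correlation on averaging can be reproduced with synthetic data. The sketch below is not based on the LITE measurements: it assumes, purely for illustration, that the receiver response is a 3-sample running mean applied to white Gaussian noise, so that each sample shares noise with its two neighbors on either side.

```python
import math
import random
import statistics

def autocorrelation(x, max_lag):
    """Normalized sample autocorrelation R(m) for m = 0..max_lag."""
    mean = statistics.fmean(x)
    d = [v - mean for v in x]
    var = sum(v * v for v in d) / len(d)
    return [sum(d[i] * d[i + m] for i in range(len(d) - m)) / (len(d) * var)
            for m in range(max_lag + 1)]

def std_of_block_means(x, n):
    """Standard deviation of the means of consecutive blocks of n samples."""
    means = [statistics.fmean(x[i:i + n]) for i in range(0, len(x) - n + 1, n)]
    return statistics.stdev(means)

rng = random.Random(1)
white = [rng.gauss(0.0, 1.0) for _ in range(60_002)]
# Hypothetical band-limited receiver: 3-sample running mean of white noise.
x = [(white[i] + white[i + 1] + white[i + 2]) / 3.0 for i in range(60_000)]

r = autocorrelation(x, 4)
n = 10
# Ratio of the measured noise of 10-bin averages to the N^(-1/2) prediction.
ratio = std_of_block_means(x, n) / (statistics.stdev(x) / math.sqrt(n))
print([round(v, 2) for v in r], round(ratio, 2))
```

For this filter the theoretical autocorrelation is 1, 2/3, 1/3, 0, 0 at lags 0 to 4, and the ratio for 10-bin averages is well above 1, which mirrors qualitatively the behavior described for Figure 6.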


When using correlated data, the difference between the measured and predicted values of
$\Delta V_{\mathrm{avg}}$ can be significant. Therefore, when using the NSF to estimate random error in
averaged measurements, a correction is required to compensate for the effects of
sample-to-sample correlation. Introducing the correlation correction function f, Eq. (23) can be
modified as


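$$\Delta V_{\mathrm{avg}} = \frac{f(N_{\mathrm{bin}})}{(N_{\mathrm{shot}}\,N_{\mathrm{bin}})^{1/2}} \left[\mathrm{NSF}^2 \cdot V_{\mathrm{avg}} + (\Delta V_{b,\mathrm{avg}})^2 + (\Delta\bar{V}_{b,\mathrm{avg}})^2\right]^{1/2}. \qquad (24)$$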


Note that signal averaging does not reduce $\Delta V_{b,\mathrm{avg}}$, and that the samples acquired from
different laser shots are uncorrelated, so that a correction for averaging over multiple
profiles is not necessary.


The function f can be either measured directly (i.e., the dashed curve in Figure 6(c))
or computed from the autocorrelation function using

$$f(N_{\mathrm{bin}}) = \left[1 + 2\sum_{m=1}^{N_{\mathrm{bin}}-1} \left(\frac{N_{\mathrm{bin}} - m}{N_{\mathrm{bin}}}\right) R(m)\right]^{1/2}. \qquad (25)$$


Here R is the autocorrelation function, as shown in Figure 6(a). Values of f
computed using Eq. (25) are also plotted (solid curve in Figure 6(c)), and are
generally consistent with the measurements (dashed curve) when small numbers of
samples are averaged. Analytically derived values of f are smaller than the measurements
for large averages, due probably to systematic errors such as the baseline ripple and/or
other electronic oscillations imposed on the measurement [7]. The “photon-bunching
noise” [5] (i.e., the first term on the right hand side of Eq. (3)) arising from fluctuations of
the emission rate of dark counts (a thermal emission process) in the PMT and of the
backscattered solar signal intensity due to the slight variability of the underlying
atmosphere may also contribute to this discrepancy. The difference, however, is
acceptably small (< 3%).
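The analytic correction of Eq. (25) is straightforward to evaluate once an autocorrelation function is available. The sketch below assumes the form $f = [1 + 2\sum_{m=1}^{N-1}(1 - m/N)\,R(m)]^{1/2}$ and uses a hypothetical autocorrelation (that of a 3-sample running mean), not the measured LITE function; the function name is illustrative.

```python
import math

def correlation_correction(n_bin, r):
    """Correlation correction f(N_bin) computed from the autocorrelation function.

    r[m] is the autocorrelation at lag m (with r[0] == 1); lags beyond the
    end of r are treated as zero.
    """
    s = 0.0
    for m in range(1, n_bin):
        r_m = r[m] if m < len(r) else 0.0
        s += (n_bin - m) / n_bin * r_m
    return math.sqrt(1.0 + 2.0 * s)

# Hypothetical autocorrelation of a 3-sample running mean (not the LITE data).
r = [1.0, 2.0 / 3.0, 1.0 / 3.0]
f_values = {n_bin: correlation_correction(n_bin, r) for n_bin in (1, 2, 5, 10, 50)}
for n_bin, f in f_values.items():
    print(n_bin, round(f, 3))
```

For uncorrelated samples (R(m) = 0 for m > 0) the function returns 1, so the $N^{-1/2}$ scaling of Eq. (23) is recovered; with correlated samples f grows with the number of bins and levels off once the averaging window is much longer than the correlation length.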


6. Summary 

In the analysis of lidar data, there are two types of errors (uncertainties) that must be 
considered: random and systematic. This paper focuses on the estimation of random 
errors in the received signal due to noise inherent in the backscatter lidar measurement. 
The statistical characteristics of photodetection using both photomultipliers and 
avalanche photodiodes have been reviewed. In general, the distribution of sampled 
photons (photon counts) is a doubly stochastic (compound) Poisson distribution. The 
multiplication process in a PMT or an APD is a stochastic process, and hence generates 
excess noise. Consequently, the multiplied carriers (electrons for PMT and electron-hole 
pairs for APD) no longer follow the Poisson statistics even if the incident photons are 
Poisson distributed. For both PMT and APD, however, there still exists a proportional 
relation between the standard deviation (RMS noise) and square root of the mean of 
multiplied carriers. Based on this fact, the noise scale factor (NSF) has been introduced 
to estimate the random error due to the shot noise. The use of NSF greatly facilitates the 
random error estimation; it allows an estimate of random error for each individual sample 
in the lidar backscatter profile. The traditional method widely used for estimating 
random error computes statistics from an ensemble of lidar measurements, and its 
application thus requires a large number of samples. Furthermore, as shown in this work,
when applying the conventional technique, an overestimation of the random error 
frequently results from the natural atmospheric variability. This bias error is especially 
acute in the measurement targets of greatest interest, such as boundary layer aerosols and 
clouds. 
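The overestimation by the conventional ensemble method can be illustrated with simulated photon counts. The sketch below is an illustration under assumptions, not a reproduction of the analysis in this paper: it assumes ideal photon counting (NSF = 1), uses a simple Poisson generator, and models atmospheric variability as a mean count rate that changes from shot to shot over an arbitrary range.

```python
import math
import random
import statistics

def poisson(rng, lam):
    """Draw one Poisson variate (Knuth's multiplication method; suitable for small lam)."""
    limit = math.exp(-lam)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k

def conventional_and_nsf(counts, nsf=1.0):
    """Return (ensemble standard deviation, RMS of the per-sample NSF estimates)."""
    ensemble_std = statistics.stdev(counts)
    # The RMS of the per-sample estimates nsf * sqrt(count) equals nsf * sqrt(mean count).
    nsf_rms = nsf * math.sqrt(statistics.fmean(counts))
    return ensemble_std, nsf_rms

rng = random.Random(2)
n_shots = 5000
# Stable atmosphere: the same mean count in every shot.
stable = [poisson(rng, 50.0) for _ in range(n_shots)]
# Variable layer: the mean count changes from shot to shot (hypothetical range).
variable = [poisson(rng, rng.uniform(30.0, 70.0)) for _ in range(n_shots)]

stable_std, stable_nsf = conventional_and_nsf(stable)
variable_std, variable_nsf = conventional_and_nsf(variable)
print(round(stable_std, 2), round(stable_nsf, 2))
print(round(variable_std, 2), round(variable_nsf, 2))
```

In the stable case the two estimates agree, whereas in the variable case the ensemble standard deviation also contains the shot-to-shot variability of the scene and therefore exceeds the shot-noise estimate obtained from the NSF.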
Background noise is another important error source. This error, however, can be
measured directly; it can be determined from subsurface samples where no scattering
signals exist, or from samples acquired at very high altitudes (e.g., 40 to 80 km) where
the scattering signal is negligibly small. Two major components - background radiation
signal and detector dark current - are included in the background signal. The
distributions of these signals have also been investigated in this paper. The analysis
using the CPL measurements at 532 nm, which used photon-counting detection, showed
that the photon counts due to the solar radiation follow the same statistics as the photon 
counts due to the laser scattering; i.e., Poisson statistics. Based on statistical 
characteristics of the Poisson distribution, algorithms have been developed for the 
CALIPSO lidar that use the solar radiation background signal to determine the NSF of 
the analog modes for PMTs and APDs. The algorithms to compute NSF from the solar 
background signal have been tested with the LITE data. It was shown that the NSF 
measurement for the PMT is largely unaffected by the dark current, because the dark 
current is very small when compared with the solar background signal. The NSF 
measurement for the APD, however, is significantly affected by the presence of dark 
current, because the dark current is large and may, in the presence of significant amplifier 
noise, behave statistically differently from the optical signal. When computing the NSF for
the APD, either the dark current must be subtracted from the solar signal or the modified 
algorithm must be used. 
[Figure 1 image; axis label: Photon Counts]

Figure 1 Examples of photon count distributions derived from CPL measurements at 532 nm for (a) solar
background signals, and (b) laser scattering signals mixed with solar background signals. In both
examples, the photon counts that comprise the input data were accumulated over an interval of 0.1 ms.

Figure 2 Examples of uncertainty estimates in attenuated backscatter ($\mathrm{m^{-1}\,sr^{-1}}$) derived from airborne lidar
measurements using photon counting detection: standard deviations computed for each altitude bin using
100 consecutive profiles (conventional method) and using the NSF. The uncertainties computed using the
conventional method are generally consistent with those derived using the NSF in the aerosol-free region
(above ~1.5 km) where the atmosphere is relatively stable. However, due to the horizontal variability of
the aerosol layer, the conventional method is seen to significantly overestimate the uncertainties below ~1.5
km in the profile.

Figure 4 NSF calculations using LITE orbit 117 data acquired at 532 nm: (a) standard deviation and square
root of the background signals, computed using the uppermost 2500 samples of each single-shot profile;
and (b) NSF computed using Eq. (15). The arrows indicate daytime and nighttime portions of the orbit.
All calculations are derived from data acquired using a photomultiplier (PMT).

Figure 5 NSF calculations using the orbit 117 data acquired at 1064 nm: (a) the square root and RMS
noise of the background signal, computed over the same altitude regime used in Figure 4; (b) NSF
computed using Eq. (15); and (c) NSF computed using Eq. (16) (pale gray line) and Eq. (17) with c = 12490
(black line). All calculations are derived from data acquired using an avalanche photodiode (APD). The
data segment displayed is identical to that shown in Figure 4.

Figure 6 (a) Autocorrelation function derived from the uppermost 2500 samples and averaged over 6000
profiles from the LITE orbit 117 measurement. (b) Standard deviations as a function of the number of
averaged bins $N_{\mathrm{bin}}$, from the measurement and predicted using $(N_{\mathrm{bin}})^{-1/2}$. (c) Correlation correction function.